Scientific Publications of the University of Toulouse II Le Mirail

Implantation Not Only SQL des bases de données multidimensionnelles [Not Only SQL Implementation of Multidimensional Databases]

Author: Chevalier Max
El Malki Mohammed
Kopliku Arlind
Teste Olivier
Tournier Ronan
Publication venue: HAL CCSD
Publication date: 01/01/2015

International audience. NoSQL (Not Only SQL) systems are growing in popularity, notably because of their ability to handle large volumes of data easily and their flexibility with respect to data types. In this article, we study the implementation of a multidimensional data warehouse with a document-oriented NoSQL system. We propose transformation rules that map a multidimensional conceptual model to a document-oriented NoSQL logical model. We propose three types of transformation for implementing multidimensional data warehouses. We experiment with these three approaches using MongoDB, and study data loading, the processes for converting from one type of implementation to another, and the pre-computation of aggregates inherent to multidimensional data warehouses.

Benchmark for OLAP on NoSQL Technologies

Author: Chevalier Max
El Malki Mohammed
Kopliku Arlind
Teste Olivier
Tournier Ronan
Publication venue: HAL CCSD
Publication date: 01/01/2015

International audience. The plethora of data warehouse solutions has created a need to compare these solutions using experimental benchmarks. Existing benchmarks rely mostly on the relational data model and do not take other models into account. In this paper, we propose an extension to a popular benchmark (the Star Schema Benchmark, or SSB) that considers non-relational NoSQL models. To avoid the post-processing otherwise required to use this data with NoSQL systems, the data is generated in different formats. To make the best use of horizontal scaling, data can be produced in a distributed file system, which removes disk or partition sizes as a limit on the generated dataset. Experimental work shows improved performance of our new benchmark.

Apport du Web et du Web de Données pour la recherche d'attributs [Contribution of the Web and the Web of Data to Attribute Retrieval]

Author: Abbes Rafik
Boughanem Mohand
Hernandez Nathalie
Kopliku Arlind
Pinel-Sauvagnat Karen
Publication venue: HAL CCSD
Publication date: 01/01/2013

National audience. In this article, we focus on entity queries for which the goal is to return a set of attributes (properties) and their values. These attributes can be collected from several sources and aggregated into a single document. For example, the entity "France" may have the attributes "Official language: French", "Cities: Paris, Toulouse, Lyon, ..." and "Population: 65350000 (in 2012)". An attribute can be single-valued or multi-valued, and may depend on other dimensions. To find the attributes of an entity, we used two sources: relational tables from the Web (extracted from HTML) and the Web of Data. To assess the potential of these sources, we conducted a user evaluation. The analyses showed the usefulness of combining these two sources to answer entity queries.

Implementing Multidimensional Data Warehouses into NoSQL

Author: Chevalier Max
El Malki Mohammed
Kopliku Arlind
Teste Olivier
Tournier Ronan
Publication venue: HAL CCSD
Publication date: 01/01/2015

International audience. Not only SQL (NoSQL) databases are becoming increasingly popular and have some interesting strengths such as scalability and flexibility. In this paper, we investigate the use of NoSQL systems for implementing OLAP (On-Line Analytical Processing) systems. More precisely, we are interested in instantiating OLAP systems (from the conceptual level to the logical level) and instantiating an aggregation lattice (optimization). We define a set of rules to map star schemas into two NoSQL models: column-oriented and document-oriented. The experimental part is carried out using the reference benchmark TPC. Our experiments show that our rules can effectively instantiate such systems (star schema and lattice). We also analyze differences between the two NoSQL systems considered. In our experiments, HBase (column-oriented) is faster than MongoDB (document-oriented) in terms of loading time.

Entrepôts de données multidimensionnelles NoSQL [NoSQL Multidimensional Data Warehouses]

Author: Chevalier Max
El Malki Mohammed
Kopliku Arlind
Teste Olivier
Tournier Ronan
Publication venue: HAL CCSD
Publication date: 01/01/2015

International audience. Data in On-Line Analytical Processing (OLAP) systems is traditionally managed by relational databases. Unfortunately, managing big data (large volumes of data) with them is becoming difficult. In this context, "Not-Only SQL" (NoSQL) environments can serve as an alternative, providing scalability while retaining some flexibility for an OLAP system. We therefore define rules for converting a star schema, together with its optimization, the lattice of pre-computed aggregates, into two NoSQL logical models: column-oriented or document-oriented. Using these rules, we implement and analyze two decision-support systems, one per model, with MongoDB and HBase. We compare them on the phases of data loading (with data generated using the TPC-DS benchmark), lattice computation and querying.

Implementation of multidimensional databases in column-oriented NoSQL systems

Author: Chevalier Max
El Malki Mohammed
Kopliku Arlind
Teste Olivier
Tournier Ronan
Publication venue: HAL CCSD
Publication date: 01/01/2015

International audience. NoSQL (Not Only SQL) systems are becoming popular due to known advantages such as horizontal scalability and elasticity. In this paper, we study the implementation of multidimensional data warehouses with column-oriented NoSQL systems. We define mapping rules that transform the conceptual multidimensional data model into logical column-oriented models. We consider three different logical models and use them to instantiate data warehouses. We focus on data loading, model-to-model conversion and OLAP cuboid computation.

Document-oriented data warehouses: complex hierarchies and summarizability

Author: Chevalier Max
El Malki Mohammed
Kopliku Arlind
Teste Olivier
Tournier Ronan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

There is increasing interest in implementing data warehouses with NoSQL document-oriented systems. In the ideal case, data can be analysed on different dimensions. These dimensions follow strict hierarchies that we can use to roll up and drill down on analysis axes. In this paper, we deal with non-strict and non-covering hierarchies, which are common issues in data warehousing, also known as summarizability issues. We show how to model these hierarchies in document-oriented systems and we propose an algorithm that can deal with summarizability issues. The new approach is tested and compared to existing approaches.